CLAIMS

What is claimed is: 

1. A system comprising: 

a lexical tree having a plurality of nodes, wherein an input speech is 
processed by propagating tokens along a plurality of different paths within the 
lexical tree, each token containing information relating to a probability score 
and a word path history; 

a buffer having a plurality of entries; and 

a merging task (1) to access a token list containing a group of tokens that 
have propagated to current state from a plurality of transition states, (2) to 
place tokens into an appropriate entry in said buffer according to a hash value 
and (3) to merge tokens with the same word path history to form a merged 
token list. 



2. The system of claim 1, further comprising a long-span M-gram 
language model integrated into the system. 

3. The system of claim 2, wherein said long-span language model is a 
tri-gram based language model. 

4. The system of claim 2, wherein said M is greater than three. 

5. The system of claim 1, wherein said hash value of a token is 
computed based on a word path history associated with said token. 

6. The system of claim 5, wherein said hash value associated with a 
particular token is calculated as follows: 

L = a(1)W(1) + a(2)W(2) + a(3)W(3)

where W(1) represents a word index number associated with the first
word in the word path history;

W(2) represents a word index number associated with the second word 
in the word path history; 

W(3) represents a word index number associated with the third word in 
the word path history; and 

a(1), a(2), a(3) are individually assigned to a constant number.
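
The weighted sum recited in claim 6 can be illustrated with a minimal sketch. The coefficient values, the table size, and the final modulo reduction are assumptions made for illustration only; the claim states only that each coefficient is a constant and does not say how the sum is mapped onto buffer entries. The name `history_hash` is hypothetical.

```python
from typing import Sequence

# Hypothetical constants: the claim only requires that each a(i) be a constant.
COEFFS = (1, 31, 961)
# Hypothetical buffer size; reducing the sum modulo this size is an assumption.
TABLE_SIZE = 1009


def history_hash(history: Sequence[int],
                 coeffs: Sequence[int] = COEFFS,
                 table_size: int = TABLE_SIZE) -> int:
    """Compute a(1)W(1) + a(2)W(2) + a(3)W(3) and reduce it to a buffer index.

    `history` holds the word index numbers W(1), W(2), W(3) of the three
    words in the word path history.
    """
    if len(history) != len(coeffs):
        raise ValueError("history and coefficients must have the same length")
    weighted_sum = sum(a * w for a, w in zip(coeffs, history))
    return weighted_sum % table_size


# Two tokens that share a word path history always map to the same entry.
index = history_hash((5, 7, 2))
```

With distinct coefficients, histories that contain the same words in a different order generally produce different sums, which reduces avoidable collisions.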

7. The system of claim 1, wherein said merging task calculates a new 
hash value for a token in the event the buffer entry associated with the 
previous hash value contains another token with different word path history. 

8. A method comprising: 

passing tokens through a transition network configured to represent 
search paths for decoding an input speech; 

accessing a token list containing a group of tokens that have propagated 
to current state from a plurality of transition states, each token in the token list 
containing information relating to a word path history and a probability score; 

calculating a hash value for each token in said token list; and 

merging tokens with the same word path history according to said hash
value.

9. The method of claim 8, further comprising integrating a long-span
M-gram language model in a speech recognition system.

10. The method of claim 8, wherein said long-span language model is a 
tri-gram based language model. 

11. The method of claim 8, wherein said hash value of a particular token 
is computed based on said word path history associated with said token. 

12. The method of claim 8, wherein said merging tokens comprises:

placing tokens into an appropriate entry in a buffer according to said
hash value;

if the entry in the buffer associated with said hash value is occupied, 
determining if a word path history associated with the token residing therein 
matches a word path history associated with a current token; and 

if the word path history of the preexisting token and the current token 
are the same, retaining one of the tokens with the higher probability score and 
discarding the other token. 
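
The merging steps of claim 12 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the claimed implementation: the `Token` class, the `bucket_index` hash, and the buffer size are hypothetical, and tokens whose hash values collide but whose histories differ are simply kept side by side in the same entry (chaining), a stand-in for whatever collision handling the system actually uses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Token:
    history: Tuple[int, ...]  # word index numbers of the word path history
    score: float              # probability score; higher is better


def bucket_index(history: Tuple[int, ...], size: int) -> int:
    # Hypothetical hash: position-weighted sum of word indices, modulo size.
    return sum((i + 1) * w for i, w in enumerate(history)) % size


def merge_tokens(tokens: List[Token], size: int = 64) -> List[Token]:
    """Merge tokens that share a word path history, keeping the best score."""
    buffer: List[List[Token]] = [[] for _ in range(size)]
    for tok in tokens:
        entry = buffer[bucket_index(tok.history, size)]
        for i, resident in enumerate(entry):
            if resident.history == tok.history:
                # Same history: retain the higher-scoring token, drop the other.
                if tok.score > resident.score:
                    entry[i] = tok
                break
        else:
            # Entry empty, or occupied only by tokens with other histories.
            entry.append(tok)
    return [tok for entry in buffer for tok in entry]


merged = merge_tokens([
    Token((1, 2, 3), -5.0),
    Token((1, 2, 3), -3.0),
    Token((4, 5, 6), -7.0),
])
```

Here the two tokens with history `(1, 2, 3)` collapse into the one with the higher score, so the merged list holds one token per distinct history.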

13. The method of claim 8, further comprising computing a new hash 
value for a token in the event the buffer entry associated with the previous 
hash value is occupied by another token with different word path history. 

14. The method of claim 13, wherein said new hash value is computed 
based on a collision principle to ensure that a subsequent token with the same 
word path history will go through the hash table in a proper order and be 
assigned to the same new index number. 
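
One way to satisfy the property in claim 14 is a deterministic probe sequence, sketched below. Linear probing is an assumption chosen for illustration; the claims do not name a specific collision rule. The function `find_slot` and the position-weighted hash are hypothetical. Because the probe order depends only on the word path history, a later token with the same history visits the same entries in the same order and arrives at the same index.

```python
from typing import List, Optional, Tuple

History = Tuple[int, ...]


def find_slot(table: List[Optional[History]], history: History) -> int:
    """Return the index of the entry for `history`, probing on collisions."""
    size = len(table)
    # Hypothetical initial hash value: position-weighted sum modulo table size.
    start = sum((i + 1) * w for i, w in enumerate(history)) % size
    for step in range(size):
        # New hash value after each collision: the next entry, wrapping around.
        index = (start + step) % size
        if table[index] is None or table[index] == history:
            return index
    raise RuntimeError("hash table is full")


table: List[Optional[History]] = [None] * 5
first, second = (1, 1, 1), (6, 1, 1)   # both hash to index 1 in a table of 5

slot_first = find_slot(table, first)
table[slot_first] = first
slot_second = find_slot(table, second)  # collides, so it moves to the next entry
table[slot_second] = second
```

A subsequent call to `find_slot(table, second)` returns the same index as before, which is what allows a later token with that history to be merged with the one already stored.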

15. A machine-readable medium that provides instructions, which 
when executed by a processor cause said processor to perform operations 
comprising: 

accessing a token list containing a group of tokens that have propagated 
to current state from a plurality of transition states, each token in the token list 
containing information relating to a word path history and a probability score; 

calculating a hash value for each token in said token list; and

merging tokens with the same word path history according to said hash
value.

16. The machine-readable medium of claim 15, wherein said hash value
of a particular token is computed based on said word path history associated 
with said token. 

17. The machine-readable medium of claim 15, wherein said operation 
of merging tokens comprises: 

placing tokens into an appropriate entry in a buffer according to said 
hash value; 

if the entry in the buffer associated with said hash value is occupied, 
determining if a word path history associated with the token residing therein 
matches a word path history associated with a current token; and 

if the word path history of the preexisting token and the current token 
are the same, retaining one of the tokens with the higher probability score and 
discarding the other token. 

18. The machine-readable medium of claim 15, wherein said operation 
further comprises computing a new hash value for a token in the event the 
buffer entry associated with the previous hash value is occupied by another 
token with different word path history. 

19. The machine-readable medium of claim 18, wherein said new hash 
value is computed based on a collision principle to ensure that a subsequent 
token with the same word path history will go through the hash table in a 
proper order and be assigned to the same new index number. 